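Recent advances in high-throughput sequencing technologies have made it possible to extract multiple features that depict patient samples at diverse and complementary molecular levels. The generation of such data has created new challenges in computational biology regarding the integration of high-dimensional and heterogeneous datasets that capture the interrelationships between multiple genes and their functions. Thanks to their versatility and their ability to learn synthetic latent representations of complex data, deep learning methods offer promising perspectives for integrating multi-omics data. These methods have led to the design of many original architectures, most of them based on autoencoder models. However, because the task is difficult, the integration strategy is fundamental to taking full advantage of the particularities of each source without losing the global trends. This paper presents a novel strategy for building a customizable autoencoder model that adapts to the dataset used in high-dimensional multi-source integration. We assess the impact of integration strategies on the latent representation and combine the best strategies to propose a new method (https://github.com/hakimbenkirane/customics). We focus here on the integration of data from multiple omics sources and demonstrate the performance of the proposed method on test cases for several tasks, such as classification and survival analysis.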
translated by 谷歌翻译
We consider the problem of modelling high-dimensional distributions and generating new examples of data whose complex relational feature structure is coherent with a graph skeleton. The model we propose generates data features constrained by the specific graph structure of each data point by splitting the task into two phases. In the first phase it models the distribution of the features associated with the nodes of the given graph; in the second it completes the edge features conditionally on the node features. We follow the strategy of implicit distribution modelling via a generative adversarial network (GAN), combined with a permutation-equivariant message-passing architecture operating over the sets of nodes and edges. This makes it possible to generate the feature vectors of all the graph objects in one go (in two phases), as opposed to the much slower one-by-one generation of sequential models. It also avoids the expensive graph-matching procedures usually needed for likelihood-based generative models, and it uses network capacity efficiently by being insensitive to the particular node ordering in the graph representation. To the best of our knowledge, this is the first method that models the feature distribution along the graph skeleton, allowing the generation of annotated graphs with user-specified structures. Our experiments demonstrate the ability of our model to learn complex structured distributions through quantitative evaluation on three annotated graph datasets.
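One of the most discussed issues in graph generative modelling is the ordering of the representation. One solution is to use equivariant generative functions, which ensure invariance to the ordering. After discussing some properties of such functions, we propose 3G-GAN, a three-stage model relying on GANs and equivariant functions. The model is still under development. However, we present some encouraging exploratory experiments and discuss the issues that remain to be addressed.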
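To make the notion of equivariance concrete, here is a minimal sketch in plain Python. It is not the 3G-GAN architecture: the layer form (a DeepSets-style mix of each node's own feature with the mean over all nodes), the weights, and the names `equivariant_layer` and `permute` are assumptions chosen only for illustration. The check shows that reordering the nodes before the layer gives the same result as reordering its outputs afterwards.

```python
import math


def equivariant_layer(nodes, w_self=0.5, w_mean=2.0):
    """Hypothetical DeepSets-style layer on scalar node features.

    Each output mixes the node's own feature with the mean over all nodes.
    The mean does not depend on node order, so the layer is
    permutation-equivariant.
    """
    mean = sum(nodes) / len(nodes)
    return [w_self * x + w_mean * mean for x in nodes]


def permute(items, perm):
    """Reorder items so that position k holds items[perm[k]]."""
    return [items[i] for i in perm]


features = [1.0, 4.0, -2.0, 0.5]  # illustrative node features
perm = [2, 0, 3, 1]               # an arbitrary node reordering

layer_then_permute = permute(equivariant_layer(features), perm)
permute_then_layer = equivariant_layer(permute(features, perm))

# Equivariance: both orders of operations agree.
is_equivariant = all(
    math.isclose(a, b)
    for a, b in zip(layer_then_permute, permute_then_layer)
)
print(is_equivariant)
```

A generator built only from layers of this kind assigns the same distribution to every ordering of the same graph, which is the property the abstract refers to.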
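This paper presents a new Expectation Propagation (EP) framework for image restoration using patch-based prior distributions. While Monte Carlo techniques are classically used to sample from intractable posterior distributions, they can suffer from scalability issues in high-dimensional inference problems such as image restoration. To address this issue, EP is used here to approximate the posterior distributions by products of multivariate Gaussian densities. Moreover, imposing structural constraints on the covariance matrices of these densities allows for greater scalability and distributed computation. While the method is naturally suited to handling additive Gaussian observation noise, it can also be extended to non-Gaussian noise. Experiments on denoising, inpainting, and deconvolution problems with Gaussian and Poisson noise illustrate the potential benefits of this flexible approximate Bayesian method, at a reduced computational cost compared with sampling techniques.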
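The scalability argument rests on a standard property: a product of Gaussian densities is itself proportional to a Gaussian whose precision is the sum of the factor precisions. The sketch below illustrates only this combination step, not the EP updates of the paper. It assumes diagonal covariances as one illustrative structural constraint (the abstract does not say which constraints are used), and the function name `product_of_diagonal_gaussians` is hypothetical.

```python
def product_of_diagonal_gaussians(factors):
    """Combine Gaussian factors with diagonal covariances.

    factors: list of (mean, variance) pairs, each a list of per-dimension
    values. Returns the (mean, variance) of the normalized product.
    Per dimension, precisions (1 / variance) add, and the mean is the
    precision-weighted average of the factor means.
    """
    dim = len(factors[0][0])
    precision = [0.0] * dim
    shift = [0.0] * dim  # sum over factors of mean / variance
    for mean, variance in factors:
        for k in range(dim):
            precision[k] += 1.0 / variance[k]
            shift[k] += mean[k] / variance[k]
    combined_mean = [s / p for s, p in zip(shift, precision)]
    combined_variance = [1.0 / p for p in precision]
    return combined_mean, combined_variance


# Two illustrative factors over a 2-pixel "image".
factors = [
    ([0.0, 1.0], [1.0, 2.0]),
    ([2.0, 1.0], [1.0, 2.0]),
]
mean, variance = product_of_diagonal_gaussians(factors)
print(mean, variance)  # [1.0, 1.0] [0.5, 1.0]
```

With a diagonal structure each pixel is handled independently, so the cost is linear in the number of pixels and the loop over dimensions can be split across workers. Richer covariance structures trade some of that simplicity for a better approximation.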